Disclosure Control Methods and Information Loss for Microdata

Authors

Josep Domingo-Ferrer

Vicenç Torra

Abstract

Statistical disclosure control (SDC) seeks to modify statistical data so that they can be published without giving away confidential information that can be linked to specific respondents. The challenge for SDC is to achieve this modification with minimum loss of the detail and accuracy sought by database users. SDC methods for microdata are usually known as masking methods, of which there is a wide range. From the point of view of their operational principles, current masking methods fall into the following two categories (Willenborg and De Waal 2001):

• Perturbative. The microdata set is distorted before publication. In this way, unique combinations of scores in the original dataset may disappear and new unique combinations may appear in the perturbed dataset; such confusion is beneficial for preserving statistical confidentiality. The perturbation method used should be such that statistics computed on the perturbed dataset do not differ significantly from the statistics that would be obtained on the original dataset.

• Nonperturbative. Nonperturbative methods do not alter data; rather, they produce partial suppressions or reductions of detail on the original dataset. Global recoding, local suppression, and sampling are examples of nonperturbative masking.
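The two categories of masking can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (add_noise, global_recode, local_suppress, sample_records), their parameters, and the toy thresholds are all assumptions chosen for illustration. Additive Gaussian noise stands in for the perturbative family, and the local suppression shown is a deliberately crude variant that blanks every key attribute of a rare record.

```python
import random
import statistics
from collections import Counter


def add_noise(values, rel_sd=0.1, seed=0):
    """Perturbative (illustrative): add zero-mean Gaussian noise whose
    standard deviation is rel_sd times the attribute's own std deviation."""
    rng = random.Random(seed)
    sd = statistics.pstdev(values) * rel_sd
    return [v + rng.gauss(0.0, sd) for v in values]


def global_recode(ages, width=10):
    """Nonperturbative: replace each exact age by a coarser interval label."""
    labels = []
    for age in ages:
        low = (age // width) * width
        labels.append(f"{low}-{low + width - 1}")
    return labels


def local_suppress(records, keys, k=2):
    """Nonperturbative (crude variant): blank the key attributes of records
    whose combination of key values occurs fewer than k times."""
    counts = Counter(tuple(r[key] for key in keys) for r in records)
    masked = []
    for r in records:
        combo = tuple(r[key] for key in keys)
        if counts[combo] < k:
            r = {**r, **{key: None for key in keys}}
        masked.append(r)
    return masked


def sample_records(records, fraction=0.5, seed=0):
    """Nonperturbative: publish only a random subset of the records."""
    rng = random.Random(seed)
    n = max(1, round(len(records) * fraction))
    return rng.sample(records, n)
```

For example, global_recode([23, 37, 40]) returns ["20-29", "30-39", "40-49"]: no value is falsified, but detail is reduced. By contrast, add_noise changes every value, so its usefulness depends on how closely statistics computed on the noisy data track those of the original.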


Similar resources

A Quantitative Comparison of Disclosure Control Methods for Microdata

As described in Chapter 5, there is a plethora of statistical disclosure control (SDC) methods to protect microdata. This chapter provides guidance in choosing a particular SDC method by comparing some of the methods discussed in Chapter 5 on the basis of both information loss and disclosure risk. Information loss can be readily quantified using analytical measures (either generic or data-use-s...


Source Data Perturbation in Statistical Disclosure Control

When tables of quantitative data are generated from a datafile, the release of those tables should not reveal information concerning individual respondents. This disclosure of individual respondents in the microdata file can be prevented by applying disclosure control methods at the table level, but this may create inconsistencies across tables. Alternatively, disclosure control methods can be ...


Automatic Generation of Masked Microdata

Disclosure Control is the discipline concerned with the modification of data containing confidential information about individual entities, such as persons, households, businesses, etc. in order to prevent third parties working with these data from recognizing entities in the data and thereby disclosing information about these entities. In very broad terms, disclosure risk is the risk that a gi...


Post-Masking Optimization of the Tradeoff between Information Loss and Disclosure Risk in Masked Microdata Sets

Previous work by these authors has been directed to measuring the performance of microdata masking methods in terms of information loss and disclosure risk. Based on the proposed metrics, we show here how to improve the performance of any particular masking method. In particular, post-masking optimization is discussed for preserving as much as possible the moments of first and second order (and...


An approximate microaggregation approach for microdata protection

Microdata protection is a hot topic in the field of Statistical Disclosure Control, which has gained special interest after the disclosure of 658000 queries by the America Online (AOL) search engine in August 2006. Many algorithms, methods and properties have been proposed to deal with microdata disclosure. One of the emerging concepts in microdata protection is k-anonymity, introduced by Samar...




Publication date: 2001
